Automating Quarto reports with parameters

Author

Jadey Ryan

Published

May 27, 2024

Data professionals transform raw data into actionable insights, which are then communicated to decision-makers in reports. Variations of the report may be generated for alternative analyses, time periods, regions, or other groupings of the data.

There is a spectrum of ways to produce these report variations with varying levels of automation:

  1. Fully manual: run the analysis one variation at a time in Excel, R, or Python, and then manually embed the tables and figures output into each report.

  2. Mostly manual: use literate programming with Quarto to weave together the code and narrative into a separate Quarto document for each report variation.

  3. Mostly automated: parameterize a Quarto document by defining and using computational parameters, and then generate each report variation by changing the parameters and rendering.

  4. Fully automated: programmatically render all report variations at once by running an R or Bash/shell script to render the parameterized Quarto template with each of the defined parameters.

If your current workflow is similar to the Fully manual or Mostly manual options above, consider how you might feel if asked to regenerate all the report variations after an update to the data. How much time and tedious labor would it take and how many potential copy/paste errors would arise?

Instead, imagine the Mostly automated or Fully automated options in which you update the data and re-render all reports from a parameterized Quarto template. Think of how much more time you could have to work on higher impact projects!

A real-life application: automated custom soil health reports

Over 1,000 soil samples have been collected and analyzed to develop a baseline of soil health in Washington State as part of the State of the Soils Assessment. Since 2020, more than 300 participating farmers have received customized soil health reports, intended to help them access, understand, and translate their soil data into informed management decisions.

Parameterized reports with Quarto allowed us to automate the process of dynamically generating hundreds of customized reports in two formats: interactive HTML and printable PDF. Without the powerful capabilities of Quarto, we would not have had the staff capacity to create these comprehensive reports in both formats.

Watch how over 14 reports are automatically generated one after another from an R script. Notice the ETA of 17 minutes shown in the progress bar (a feature of purrr::pwalk()):

To learn more about this project, see the slides or watch my posit::conf(2023) talk: Parameterized Quarto Reports Improve Understanding of Soil Health.

We also turned this Quarto project into an R package called soils, which you can learn about in this blog post or this webinar.

Parameterized reports 101

Parameterized reports are similar to functions, where the Quarto template document (.qmd file) is the function, the parameter is the input, and the report variations are the output. You can have as few or as many parameters as you like.

Parameters can be character, integer, numeric, or logical variables. The custom soil health reports used two parameters: producer_id <chr> and year <int>. These parameters were used in the report headings as inline code, and in code chunks to filter the data and highlight only the farmer’s data. See all the source code for the soil health reports in the soils GitHub repository.
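
To make the function analogy concrete, here is a minimal Python sketch. The function name `soil_report` and the text it returns are hypothetical stand-ins for a template and its rendered output, not part of the soils project. The default argument values play the role of default parameter values.

```python
def soil_report(producer_id: str = "ABC01", year: int = 2023) -> str:
    """Hypothetical stand-in for a parameterized template.

    The arguments are the parameters; the returned string stands in
    for one rendered report variation.
    """
    return f"{year} soil health report for producer {producer_id}"


# Calling with no arguments is like rendering with the default parameters.
default_report = soil_report()

# Passing arguments is like rendering one variation with custom parameters.
custom_report = soil_report(producer_id="XYZ01", year=2022)

print(default_report)
print(custom_report)
```

One template plus many sets of inputs yields many reports, just as one function plus many sets of arguments yields many return values.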

Define and access parameters

Parameters are defined and accessed differently depending on whether you use the Knitr or Jupyter engine.

With the Knitr engine, define the parameters in the YAML header with default values:

---
params:
  year: 2023
  producer_id: ABC01
---

Access the parameters in the params list object:

```{r}
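# Keep only the rows that match both parameters.
# Conditions separated by commas in filter() are combined with a vectorized AND.
filtered <- data |> 
  dplyr::filter(year == params$year, producer_id == params$producer_id)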
```

With the Jupyter engine, designate a cell at the top of the document with the tag parameters and provide default values:

```{python}
#| tags: [parameters]

year = 2023
producer_id = 'ABC01'
```

Access the parameters by name:

```{python}
import pandas as pd

filtered_data = data[(data['year'] == year) & (data['producer_id'] == producer_id)]
```

Rendering

Single report with default parameters

The Render button in RStudio, Quarto: Preview in VS Code, or the Cmd/Ctrl + Shift + K keyboard shortcut will render and preview the report with the default parameters. You can also use the Render on Save button in RStudio or option in VS Code to automatically re-render and preview the changes after each save.

One variation at a time (mostly automated)

There are a few ways to render a report variation, one at a time, without changing the default parameters in the YAML or parameters cell:

  1. Using the command line with the -P flag:

    Terminal
    quarto render template.qmd -P year:2022 -P producer_id:XYZ01
  2. Creating a YAML file that defines the parameter values to render with and using the command line with the --execute-params flag:

    params.yml
    year: 2022
    producer_id: XYZ01
    Terminal
    quarto render template.qmd --execute-params params.yml
  3. Using the quarto::quarto_render() function with the execute_params argument:

    Console or R script
    quarto::quarto_render(
      input = "template.qmd",
      execute_params = list(
        year = 2022,
        producer_id = "XYZ01"
      )
    )
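
If you work mainly in Python, the first option can also be driven from a script. The sketch below is one possible approach, not an official Quarto API: the helper name `build_render_command` is hypothetical, and it only assembles the same `quarto render ... -P name:value` command shown above. The actual call to the Quarto CLI is left commented out because it requires Quarto on your PATH and a `template.qmd` file in the working directory.

```python
import shlex


def build_render_command(template, params):
    """Assemble a `quarto render` command with one -P flag per parameter."""
    command = ["quarto", "render", template]
    for name, value in params.items():
        command += ["-P", f"{name}:{value}"]
    return command


command = build_render_command(
    "template.qmd",
    {"year": 2022, "producer_id": "XYZ01"},
)

# Show the command as it would be typed in a terminal.
print(shlex.join(command))

# To actually render (assumes the Quarto CLI is installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the command to `subprocess.run()` as a list avoids shell quoting problems if a parameter value ever contains spaces.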

All variations at once (fully automated)

To render all report variations at once, write an R or Bash/shell script to render the parameterized Quarto template with each of the defined parameters.

  1. Create a dataframe with three columns that match the quarto::quarto_render() function arguments: output_format, output_file, and execute_params.

    data <- expand.grid(
      year = c(2022, 2023),
      producer_id = c("ABC01", "ABC02", "XYZ01", "XYZ02"), 
      stringsAsFactors = FALSE)
    
    df <- data |> 
      dplyr::mutate(
        output_format = "html",       # Output format (html, word, etc.)
        output_file = paste(          # Output file name
          year, producer_id, "report.html",
          sep = "-"
        ),
        execute_params = purrr::map2( # Named list of parameters
          producer_id, year, 
          \(producer_id, year) list(producer_id = producer_id, year = year)
        )
      ) |> 
      dplyr::select(-c(producer_id, year))
    
    df
    output_format  output_file             execute_params
    html           2022-ABC01-report.html  ABC01, 2022
    html           2023-ABC01-report.html  ABC01, 2023
    html           2022-ABC02-report.html  ABC02, 2022
    html           2023-ABC02-report.html  ABC02, 2023
    html           2022-XYZ01-report.html  XYZ01, 2022
    html           2023-XYZ01-report.html  XYZ01, 2023
    html           2022-XYZ02-report.html  XYZ02, 2022
    html           2023-XYZ02-report.html  XYZ02, 2023
  2. Use purrr::pwalk() to map over each row of the dataframe and render each report variation.

    purrr::pwalk(
      .l = df,                      # Dataframe to map over
      .f = quarto::quarto_render,   # Quarto render function
      input = "template.qmd",       # Named arguments of .f
      .progress = TRUE              # Optionally, show a progress bar
    )

I’m no expert in Bash or shell scripting and prefer R all around. However, Solomon Moon’s Posit blog post on Quarto reporting infrastructure has a section on parameterized reporting that includes demo code to render all report variations in a Bash/shell script.
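
For readers who use the Jupyter engine and would rather stay in Python, the same loop can be sketched as follows. This is a minimal sketch under stated assumptions, not a tested production script. The helper names `build_jobs` and `render_all` are hypothetical. The file naming mirrors the R example above, and the file extension is assumed to match the format name (true for html). It defaults to a dry run that only prints the commands. Set `dry_run=False` to call the Quarto CLI, which must be installed.

```python
import itertools
import shlex
import subprocess


def build_jobs(template, years, producer_ids, output_format="html"):
    """Return one `quarto render` command (argument list) per report variation."""
    jobs = []
    for producer_id, year in itertools.product(producer_ids, years):
        output_file = f"{year}-{producer_id}-report.{output_format}"
        jobs.append([
            "quarto", "render", template,
            "-P", f"year:{year}",
            "-P", f"producer_id:{producer_id}",
            "--to", output_format,
            "--output", output_file,
        ])
    return jobs


def render_all(jobs, dry_run=True):
    """Print each command; only call the Quarto CLI when dry_run is False."""
    for job in jobs:
        print(shlex.join(job))
        if not dry_run:
            subprocess.run(job, check=True)  # stops at the first failed render


jobs = build_jobs(
    "template.qmd",
    years=[2022, 2023],
    producer_ids=["ABC01", "ABC02", "XYZ01", "XYZ02"],
)
render_all(jobs)  # dry run: prints the eight commands without rendering
```

Building the full list of commands before running any of them makes it easy to inspect the output file names first, much like printing `df` before calling `purrr::pwalk()`.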

End-to-end workflow

Creating parameterized reports with Quarto involves a systematic approach that ensures flexibility, efficiency, and reproducibility. To help you navigate this process, let’s walk through an end-to-end workflow that transforms your reporting process from manual to fully automated. This step-by-step guide takes you from drafting a basic report template to rendering all variations at once.

For a demonstration of the workflow with example Quarto files and code, follow along with one of my parameterized reporting workshops.

  1. Write the report template with the default values hard-coded (i.e., don’t use parameters yet).

    Begin by drafting your report with fixed values. This allows you to focus on the content and structure without worrying about parameterization. Render your report to see how it looks, review it for accuracy and clarity, and make any necessary adjustments.

  2. Once satisfied with the single variation of the report, parameterize it.

    Define your parameters and set default values. In Knitr, this is done in the YAML header, whereas Jupyter uses a designated parameters cell.

    Replace your hard-coded values with the corresponding parameter names. This transforms your static report into a flexible template in which the parameter names are placeholders to be replaced with the parameter values when the report is rendered.

  3. Render the single report with the default parameters.

    Render the newly parameterized report. Review the output to ensure everything works as expected. Make any necessary adjustments to refine your report.

  4. Change the parameters to extreme cases and render various scenarios.

    To ensure robustness, test your report with extreme parameter values. This could mean rendering a report with minimal data, maximal data, or using the boundary values for your parameters. This step helps identify any formatting or presentation issues that might arise in edge cases.

    For example, I tested my edge cases by rendering reports for a farmer who had only one sample and a farmer who had ten samples. This helped me identify and correct awkward page breaks for reports with more than eight samples.

  5. Render all variations of the report at once.

    Finally, when you’re confident that your report handles all scenarios gracefully, render all desired variations with an R or Bash/shell script.
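
For step 4, one way to choose edge cases is to count the records behind each parameter value and test the extremes. The sketch below is illustrative only: the `samples` list is made-up data, and it assumes one record per soil sample. It picks the producers with the fewest and the most samples as test parameters.

```python
from collections import Counter

# Made-up example data: one (producer_id, year) tuple per soil sample.
samples = (
    [("ABC01", 2023)]           # a producer with a single sample
    + [("XYZ01", 2023)] * 10    # a producer with ten samples
    + [("ABC02", 2023)] * 4
)

# Count samples per producer.
counts = Counter(producer_id for producer_id, _ in samples)

# The producers with the fewest and the most samples are the edge cases.
fewest = min(counts, key=counts.get)
most = max(counts, key=counts.get)

edge_cases = [
    {"producer_id": fewest, "year": 2023},
    {"producer_id": most, "year": 2023},
]
print(edge_cases)
```

Each dictionary in `edge_cases` can then be passed as the parameters for a test render.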

Parameters in R Markdown

Parameterized reports may already be familiar to R Markdown users. In fact, there is a very nice quality-of-life feature for R Markdown parameters not yet implemented for Quarto: .Rmd documents with parameters have a Knit with Parameters GUI built with Shiny miniUI.

Figure from R Markdown: The Definitive Guide (Xie et al. 2023).

The current workaround is to build a web app to get the input, serialize to the YAML header, and then render, as described in this GitHub discussion. If a Parameters GUI for Quarto seems useful to you, upvote quarto-dev/quarto-r issue #132.
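
The serialize-then-render part of that idea can be sketched without any web app. The example below is a simplified variant, not the approach from the GitHub discussion: instead of rewriting the YAML header, it writes the collected values to a separate params file for use with the `--execute-params` flag described earlier. The helper name `params_to_yaml` is hypothetical, and it handles only flat scalar values (strings, numbers, booleans), relying on JSON scalars also being valid YAML.

```python
import json


def params_to_yaml(params):
    """Serialize a flat dict of scalar parameters to simple YAML text."""
    lines = [f"{name}: {json.dumps(value)}" for name, value in params.items()]
    return "\n".join(lines) + "\n"


# Values that a form or other user interface might have collected.
user_input = {"year": 2022, "producer_id": "XYZ01"}

yaml_text = params_to_yaml(user_input)
print(yaml_text)

# To use it (assumes the Quarto CLI is installed):
# from pathlib import Path
# Path("params.yml").write_text(yaml_text, encoding="utf-8")
# then run: quarto render template.qmd --execute-params params.yml
```

Nested or list-valued parameters would need a real YAML library rather than this line-by-line approach.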

Why use Quarto instead of R Markdown?

If you don’t absolutely need the Parameters GUI, you may be wondering if you should use Quarto for parameterized reports instead of R Markdown. While R Markdown is not going away, Quarto is the next-generation version of R Markdown: it combines and expands upon the functionality of the R Markdown ecosystem in a single, consistent publishing system.

Moreover, Quarto is multi-language and multi-engine so not only R users benefit from parameterized reports, but also users of Python, Julia, and potentially new languages yet to be developed. Similarly, Quarto easily supports multiple report formats (PDF, Word, HTML, etc.) from a single document without the need for additional packages or unmanageable codebases.

To learn more about the differences between R Markdown and Quarto, read through the FAQ for R Markdown Users.

More resources

For more technical details, check out any of the workshops I have led on parameterized reporting with Quarto. The workshops provide more detailed information, how-to instructions and example code, and exercises with code snippets you can adapt for your own projects.

While this post talked only of parameterized reports, you can also parameterize presentations with Quarto. Jumping Rivers has a blog post demonstrating the process of defining parameters within a Quarto Reveal JS presentation.